Software fault detection and recovery in critical real-time systems: An approach based on loose coupling

Authors

  • Pekka Alho
  • Jouni Mattila
Abstract

Remote handling (RH) systems are used to inspect, modify, and maintain components in the ITER machine and as such are an example of a mission-critical system. Failure in a critical system may cause damage, significant financial losses, and loss of experiment runtime, which makes dependability one of the most important properties of such systems. However, even if the software for RH control systems has been developed using best practices, the system might still fail due to undetected faults (bugs), hardware failures, etc. Critical systems therefore need the capability to tolerate faults and resume operation after they occur. The design of effective fault detection and recovery mechanisms is nevertheless challenging because of timeliness requirements, growth in scale, and complex interactions. In this paper we evaluate the effectiveness of a service-oriented architectural approach to fault tolerance in mission-critical real-time systems. We use a prototype implementation for service management with an experimental RH control system and an industrial manipulator. The fault tolerance relies on the high level of decoupling between services to recover from transient faults by restarting services. If the recovery process is not successful, the system can still be used, provided the fault was not in a critical software module.
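
The recovery policy described in the abstract (restart a failed service, and keep operating if an unrecoverable service is not critical) can be illustrated with a minimal sketch. This is not the authors' prototype: the `Service` type, the `recover` function, the `max_restarts` limit, and the `flaky_start` helper are all hypothetical names introduced here for illustration, and the transient fault is simulated rather than detected on real hardware.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Service:
    """Hypothetical description of a loosely coupled service."""
    name: str
    critical: bool
    start: Callable[[], bool]  # returns True if the service came up healthy


def recover(service: Service, max_restarts: int = 3) -> str:
    """Try to recover a failed service by restarting it.

    Returns "recovered" if a restart succeeds, "degraded" if a
    non-critical service stays offline, or "halted" if a critical
    service could not be recovered.
    """
    for _ in range(max_restarts):
        if service.start():
            return "recovered"
    return "halted" if service.critical else "degraded"


def flaky_start(failures_before_success: int) -> Callable[[], bool]:
    """Build a start function that fails a fixed number of times.

    This only simulates a transient fault for the example.
    """
    remaining = failures_before_success

    def start() -> bool:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            return False
        return True

    return start
```

For example, under these assumptions `recover(Service("camera", False, flaky_start(2)))` restarts the service until it comes up, while a critical service whose restarts keep failing leads to the "halted" outcome instead of degraded operation.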

Similar resources

Online Fault Detection and Isolation Method Based on Belief Rule Base for Industrial Gas Turbines

Real-time and accurate fault detection has attracted increasing attention with the growing demand for higher operational efficiency and safety of industrial gas turbines as complex engineering systems. Current methods based on condition monitoring data have drawbacks in using both expert knowledge and quantitative information for detecting faults. For this reason, this paper proposes...

Software Fault Tolerance in Computer Operating Systems

This chapter provides data and analysis of the dependability and fault tolerance of three operating systems: the Tandem/GUARDIAN fault-tolerant system, the VAX/VMS distributed system, and the IBM/MVS system. Based on measurements from these systems, basic software error characteristics are investigated. Fault tolerance in operating systems resulting from the use of process pairs and recovery ...

An approach to fault detection and correction in design of systems using of Turbo codes

We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes in order to obtain flexibility in the design of error-correcting codes. Code combining techniques are very effective, and turbo codes are one example of such codes. The Algorithm-based fault tolerance techniques that to detect errors rely on the c...

A Novel Intelligent Fault Diagnosis Approach for Critical Rotating Machinery in the Time-frequency Domain

Rotating machinery is a common class of machinery in industry. The root cause of faults in rotating machinery is often faulty rolling element bearings. This paper presents a novel technique using artificial neural network learning for automated diagnosis of localized faults in rolling element bearings. The inputs of this technique are a number of features (harmmean and median), whic...

Design of nonlinear parity approach to fault detection and identification based on Takagi-Sugeno fuzzy model and unknown input observer in nonlinear systems

In this study, a novel fault detection scheme is developed for a class of nonlinear systems in the presence of sensor noise. A nonlinear Takagi-Sugeno fuzzy model is implemented to create multiple models. While the T-S fuzzy model is used for only the nonlinear distribution matrix of the fault and measurement signals, a larger category of nonlinear systems is considered. Next, a mapping to decou...

Publication date: 2014